cDNA encoding stem cell growth factor (SCGF; 245 aa), a
novel human growth factor for primitive hematopoietic
progenitor cells, has been previously reported (Hiraoka, A.,
Sugimura, A., Seki, T., Nagasawa, T., Ohta, N., Shimonishi,
M., Hagiya, M. and Shimizu, S. Proc. Natl Acad. Sci. USA 94,
7577-7582, 1997). Here we report the cloning and
characterization of a full-length SCGF cDNA. This protein
consists of 323, 328 and 328 aa in the human, murine and rat
forms, the latter two of which share 85.1% and 83.3% aa
identity, and 90.4% and 90.4% aa similarity to the human
protein, respectively. Because the newly identified human
clone encodes a protein longer by 78 aa than that previously
identified, we term the longer clone hSCGF-α and the shorter
one hSCGF-β. A computer-assisted homology search reveals
that SCGF is a new member of the C-type lectin superfamily,
and that SCGF shows the greatest homology to tetranectin
among the members of the family (27.2-33.7% aa identity and
46.0-53.6% aa similarity). SCGF transcripts are detected in
spleen, thymus, appendix, bone marrow and fetal liver.
Fluorescent in situ hybridization mapping indicates that the
SCGF gene is located on chromosome 19 at position q13.3 in
the human and on chromosome 7 at position B3-B5 in the
mouse, close to the flk-2/flt3 ligand and interleukin-11
genes in both species. © 1998 Academic Press



A cDNA for a novel hematopoietic cytokine, stem cell growth
factor (SCGF), was originally isolated from a cDNA library
prepared from a human myeloid cell line, KPB-M15, derived
from chronic myelogenous leukemia in blast crisis (1,2).
SCGF cDNA encodes a 245-aa protein without N-linked
glycosylation sites. No significant homology to entries in
the EMBL, GenBank and Swiss-Prot databases was found for the
SCGF nucleotide and amino acid sequences. SCGF mRNA is
expressed by myeloid and stromal cells, but not by lymphoid
cells.

Primitive hematopoietic progenitor cells are stimulated by
the synergistic action of multiple growth factors, including
colony-stimulating factors (CSFs) and interleukins (ILs); in
particular, such early-acting cytokines as stem cell factor
(SCF or c-kit ligand) (3-5) and flk-2/flt3 ligand (6) play a
major role. Human SCGF alone does not exhibit
colony-stimulating activity, but it shows burst-promoting
activity and granulocyte/macrophage (GM)-promoting activity
for erythroid and GM progenitor cells in primary agar
culture with erythropoietin and GM-CSF, respectively. It
further supports survival or growth of hematopoietic
progenitor cells during short-term liquid culture of human
bone marrow cells. Consequently, SCGF could mediate an
interaction between primitive hematopoietic progenitor and
stromal cells within the hematopoietic microenvironment, in
conjunction with other growth factors.

In this paper, we isolate and characterize full-length 
human (h), murine (m) and rat (r) SCGF cDNAs, indi- 
cating that SCGF is a new member of the C-type lectin 
superfamily, and that its mRNA is exclusively ex- 
pressed within the hematopoietic tissues. 

MATERIALS AND METHODS 

DDBJ/EMBL/GenBank accession numbers. The accession num- 
bers for the sequences reported in this paper are AB009244 for 
hSCGF-α, AB009245 for mSCGF and AB009246 for rSCGF.

Cloning of the SCGF cDNA. Oligonucleotides
5'-GAGTCCAGCT TAATGCAGGC A-3' and 5'-CTAGAAGGGG AACTCGCAGA C-3'
were used as PCR primers to amplify the entire length of hSCGF,
with first-strand cDNA synthesized from human bone marrow
poly(A) RNA (Clontech, Palo Alto, CA) as the template. The
thermal cycling reaction was performed in the presence of 10%
DMSO with ExTaq polymerase (Toyobo, Tokyo, Japan): denaturation
at 94°C for 1 min, annealing at 55°C for 2 min, and extension
at 72°C for 2 min. A cDNA library derived from the murine
pre-adipocyte cell line MC3T3-G2/PA6 (Riken, RCB1127) (7) was
screened to isolate a clone containing a sequence that could be
amplified under the same PCR conditions as for hSCGF. rSCGF
cDNA was amplified with the primers
5'-ATTTGGGTGC TGGGAAGCCC AGCT-3' and
5'-TCCTGGGCAG AGACCGGTTC TCTA-3', whose sequences were based
upon the 5' and 3' untranslated regions of mSCGF, respectively.
The template for PCR was synthesized from total RNA of the rat
osteosarcoma cell line ROS-17/2.8-5 (Riken, RCB0462). The
cloned DNA was subcloned into the pBluescript II SK(+) vector
to create pBluescript-SCGF and sequenced using a DNA sequencer
model 4000 (LI-COR Inc., Lincoln, Neb) (8).
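
As a rough consistency check on the primer design described above, the following sketch locates a primer on either strand of a template by searching for the primer itself and for its reverse complement. The helper names (`reverse_complement`, `find_primer_site`) are illustrative and not part of the original methods; the template is nucleotides 1081-1140 of the murine cDNA as transcribed from Figure 1, and exact matching without mismatches is a simplifying assumption.

```python
# Translation table for complementing unambiguous DNA bases.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def find_primer_site(template: str, primer: str):
    """Locate an exact primer match on either strand of the template.

    Returns (strand, start), where start is 1-based on the template's top
    strand, or None when there is no exact match (assumption: no mismatches).
    """
    template = template.upper()
    primer = primer.replace(" ", "").upper()
    pos = template.find(primer)
    if pos != -1:
        return ("+", pos + 1)
    pos = template.find(reverse_complement(primer))
    if pos != -1:
        return ("-", pos + 1)
    return None


# Nucleotides 1081-1140 of the mSCGF cDNA, transcribed from Figure 1.
MSCGF_1081_1140 = (
    "AGCGGCGCCTCTACTTCGTCTGCGAGTTCCCCTTCTAGAGAACCGGTCTCTGCCCAGGAG"
)
# Reverse primers quoted in the cloning paragraph.
HSCGF_REVERSE = "CTAGAAGGGG AACTCGCAGA C"
RSCGF_REVERSE = "TCCTGGGCAG AGACCGGTTC TCTA"

if __name__ == "__main__":
    print(find_primer_site(MSCGF_1081_1140, HSCGF_REVERSE))
    print(find_primer_site(MSCGF_1081_1140, RSCGF_REVERSE))
```

On this excerpt both reverse primers are found on the bottom strand (offsets 18 and 36 within the excerpt), which is in line with the murine clone being amplifiable with the hSCGF primers and with the rat primer being derived from the murine 3' region.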

Homology search. The deduced amino acid sequence of SCGF was
subjected to a homology search against the Swiss-Prot and
PIR databases using BLASTP 1.4.9 (9). Motif sequence
analysis was performed against the PROSITE and BLOCKS
databases using MacPattern 3.4 (10). A multiple alignment of
amino acid sequences was performed using the CLUSTAL W
Multiple Sequence Alignment Program 1.6 (11), and an
unrooted phylogenetic tree was visualized with TreeView 1.4.
Signal sequence cleavage sites were analyzed with
AnalyzeSignalase, ver. 2.03 (12), and transmembrane regions
were predicted with TopPred II, ver. 1.3 (13).

RNA hybridization. Northern blot analysis of RNA samples was
performed using Clontech human immune system multiple tissue
Northern blots. An antisense RNA probe was generated using T3
RNA polymerase and pBluescript-SCGF linearized with Hind III.

Fluorescent in situ hybridization (FISH) mapping. FISH
detection was performed according to the method previously
reported (14,15). Briefly, slides were baked at 55°C for 1
hour. After RNase treatment, the slides were denatured with
70% formamide in 2× SSC at 70°C for 2 min, followed by
dehydration with ethanol. Probes were denatured at 75°C for
5 min in a hybridization mixture containing 50% formamide
and 10% dextran sulfate, and were then loaded onto the
denatured chromosomal slides. After overnight hybridization,
slides were washed, and the signals were detected and
amplified. FISH signals and the DAPI banding pattern were
recorded separately by taking photographs, and the FISH
mapping data were assigned to chromosomal bands by
superimposing FISH signals on DAPI-banded chromosomes (15).



GAAGCTGGCAGAAGAAGGTCAAGGGGCTTGTGAGCTGCCCACCAGACTGGGACACTTGCT 60 

AGGTCTATACAGCAGTCCTACCCCTGGCATTCTGACCTCTCTACTATTTGGGTGCTGGGA 120 

AGCCCAGCTGGATGCAGGCAGCCTGGCTCTTGGGGGCCCTAGTGGTCCCTCAGCTTTTGA 180 

MQAAWLLGALVVPQLL 16 

GTTTTGGTCATGGAGCCCGAGGTCCTGGGAGGGAGTGGGAGGGAGGCTGGGGAGGTGCCC 240 

SFGHGARGPGREWEGGWGGA 36 

TGGAGGAGGAGAGAGAGCGGGAGTCACAGATGTTGAAGAATCTCCAGGAGGCCCTAGGGC 300 

LEEERERESQMLKNLQEALG 56 

TGCCCACTGGGGTGGGAAATGAGGATAATCTTGCTGAAAACCCTGAAGACAAAGAGGTCT 360 

LPTGVGNEDNLAENPEDKEV 76 

GGGAGACCACAGAGACTCAAGGGGAAGAAGAGGAAGAGGAAATCACCACAGCACCTTCTT 420 

WETTETQGEEEEEEITTAPS 96 

CTAGTCCCAACCCTTTCCCCAGCCCTTCTCCCACACCAGAGGACACTGTCACTTACATCT 480 

SSPNPFPSPSPTPEDTVTYI 116 

TGGGCCGCTTGGCCAGCCTCGATGCAGGCCTACACCAATTGCACGTCCGTCTGCACGTTT 540 

LGRLASLDAGLHQLHVRLHV 136 

TGGACACCCGTGTGGTTGAGCTGACCCAGGGGCTGCGGCAGCTGCGGGATGCTGCGAGTG 600 

LDTRVVELTQGLRQLRDAAS 156 

ACACCCGCGACTCAGTGCAAGCCCTGAAGGAGGTCCAGGACCGTGCTGAGCAGGAGCACG 660 

DTRDSVQALKEVQDRAEQEH 176 

GCCGCTTGGAGGGCTGCCTGAAGGGCCTGCGCCTTGGCCACAAGTGCTTCCTGCTCTCGC 720 

GRLEGCLKGLRLGHKCFLLS 196 

GAGACTTCGAGACCCAGGCGGCGGCGCAGGCGCGGTGCAAGGCGCGAGGTGGGAGCTTAG 780 

RDFETQAAAQARCKARGGSL 216 

CACAGCCTGCGGACCGCCAGCAAATGGATGCGCTAAGCCGGTACTTACGCGCCGCTCTCG 840 

AQPADRQQMDALSRYLRAAL 236 

CCCCCTACAACTGGCCGGTGTGGCTGGGAGTGCACGATCGGCGCTCCGAGGGGCTCTACC 900 

APYNWPVWLGVHDRRSEGLY 256 

TTTTCGAGAACGGCCAGCGCGTGTCTTTCTTCGCCTGGCACCGCGCATTCAGCCTGGAGT 960 

LFENGQRVSFFAWHRAFSLE 276 

CCGGCGCCCAGCCTAGTGCGGCAACACATCCACTCAGCCCGGATCAGCCCAATGGCGGCG 1020 

SGAQPSAATHPLSPDQPNGG 296 

TCCTGGAGAACTGCGTGGCCCAGGCCTCAGACGACGGTTCTTGGTGGGACCATGACTGTG 1080 

VLENCVAQASDDGSWWDHDC 316 

AGCGGCGCCTCTACTTCGTCTGCGAGTTCCCCTTCTAGAGAACCGGTCTCTGCCCAGGAG 1140 

ERRLYFVCEFPF* 328 

CTCTAGTGCACATTTTGCACCGTACACCGCGCACCCTATTGTTAGGGGCCTGGGAGTCGC 1200 

TCAGAGATTAAGCGTGACCATGAATACATTTTAATCAGAAGAGGTTTTTTATTTTAGATA 1260 

CTGGCACCCAGACTGATTGGGGCCAGGTGTGCTCCTGAGATTGCTTCCAAGATGCATTAT 1320 

CAGCCCAGGGATTTTAAAGGCAAACCCCACAAGATTGCATGTAGCCTGCTTACATGTAGG 1380 

CGGGAGCATAAAAATTTAAAATATAAAAAAA 1411 

FIG. 1. Nucleotide and deduced amino acid sequence of the mu- 
rine SCGF cDNA. Numbering is from the 5' end of the nucleotide 
sequence (top) and from initiation methionine of the amino acid se- 
quence (bottom). An asterisk is a stop codon. 
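
The numbering convention of Figure 1 can be checked with a short translation sketch. The code below builds the standard genetic code table and translates two excerpts transcribed from the figure (nucleotides 121-180 and 1081-1140); the function name `translate` and the choice of excerpts are illustrative assumptions, not part of the original analysis.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(cdna: str, start: int = 1) -> str:
    """Translate from a 1-based start position until a stop codon or the
    end of the sequence; the stop codon is reported as '*'."""
    seq = cdna.upper()
    protein = []
    for i in range(start - 1, len(seq) - 2, 3):
        aa = CODON_TABLE[seq[i:i + 3]]
        protein.append(aa)
        if aa == "*":
            break
    return "".join(protein)


# Nucleotides 121-180 of Figure 1; the initiation ATG begins at
# nucleotide 132, i.e. position 12 of this excerpt.
MSCGF_121_180 = (
    "AGCCCAGCTGGATGCAGGCAGCCTGGCTCTTGGGGGCCCTAGTGGTCCCTCAGCTTTTGA"
)
# Nucleotides 1081-1140 of Figure 1; codon 318 begins at position 3.
MSCGF_1081_1140 = (
    "AGCGGCGCCTCTACTTCGTCTGCGAGTTCCCCTTCTAGAGAACCGGTCTCTGCCCAGGAG"
)

if __name__ == "__main__":
    print(translate(MSCGF_121_180, start=12))
    print(translate(MSCGF_1081_1140, start=3))
```

The first call yields the 16 N-terminal residues listed under nucleotides 121-180, and the second ends at the TAG stop codon that follows residue 328.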



RESULTS 

Isolation of Human, Mouse and Rat cDNAs 
Encoding SCGF 

We have previously found that a human myeloid cell line,
KPB-M15, produces SCGF, and cloned a single cDNA encoding a
245-aa protein by expression cloning (2). To obtain other
SCGF transcripts, RT-PCR was performed with a template of
single-stranded cDNA prepared from human bone marrow poly(A)
RNA. Five clones were analyzed independently, and all showed
the same nucleotide sequence. The resulting new cDNA had an
open reading frame capable of encoding a protein of 323 aa,
which was longer by 78 aa than the previously isolated clone
(2). Here we name the longer clone hSCGF-α and the shorter
one hSCGF-β.



Homologous mSCGF cDNA (1411 bp; Figure 1) was amplified from
a cDNA library prepared from the murine preadipocyte cell
line MC3T3-G2/PA6, using PCR with hSCGF primers. The
predicted coding region of mSCGF was 328 aa in length, and
showed 85.1% aa identity and 90.4% aa similarity to the
human homologue (hSCGF-α). Homologous rSCGF cDNA was
amplified by RT-PCR from single-stranded cDNA derived from
the rat osteosarcoma cell line ROS-17/2.8-5. Five clones were
analyzed independently, and all showed the same nucleotide
sequence. The predicted coding region of rSCGF was 328 aa in
length, and showed 93.3% and 83.3% aa identity and 96.6% and
90.4% aa similarity to the murine and human homologues,
respectively.
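
For readers unfamiliar with how identity and similarity percentages of this kind are obtained, the sketch below computes both from a pair of pre-aligned sequences. It is only an illustration: the grouping of "similar" residues (`SIMILAR_GROUPS`), the exclusion of gapped columns from the denominator, and the toy sequences are assumptions, and the scoring scheme actually used for the values reported here is not stated in the text.

```python
# Assumed groups of chemically similar residues (illustrative only).
SIMILAR_GROUPS = ["ILVM", "FYW", "KRH", "DE", "ST", "NQ", "AG"]


def identity_and_similarity(aligned_a: str, aligned_b: str):
    """Return (% identity, % similarity) for two aligned sequences.

    Columns containing a gap ('-') in either sequence are skipped, so the
    denominator is the number of ungapped columns (an assumption).
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = identical = similar = 0
    for x, y in zip(aligned_a.upper(), aligned_b.upper()):
        if x == "-" or y == "-":
            continue
        compared += 1
        if x == y:
            identical += 1
            similar += 1  # identical residues also count as similar
        elif any(x in group and y in group for group in SIMILAR_GROUPS):
            similar += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100 * identical / compared, 100 * similar / compared


if __name__ == "__main__":
    # Toy example: the second sequence is hypothetical, not a real homologue.
    print(identity_and_similarity("MQAAWLLGAL", "MQAAWLMGSL"))
```

In the toy pair, the L/M column counts toward similarity but not identity, while the A/S column counts toward neither under the assumed groups.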

FIG. 2. Alignment of SCGF with members of the C-type lectin superfamily. The putative signal sequence is underlined. The different
amino acids among hSCGF, mSCGF and rSCGF (solid circles), the amino acids absent in hSCGF-β (a line with arrows at both ends) and
5 conserved Trps upstream of the second Cys from the C-terminus (cross mark) are denoted on the top. The polyglutamic acid region is boxed
and the Pro/Ser/Thr-rich PT box is boxed with dots. The RGD sequence is lettered in white on a black background. A framework of conserved Cys
or Trp is arranged in columns, and 4 conserved amino acids (VDYV) are circled in the consensus pattern of the CRD of the C-type lectin motif
sequence (doubly underlined). Abbreviations used are as follows: h, human; m, murine; r, rat; TN, tetranectin; CPCP, cartilage proteoglycan
core protein; ASGR, asialoglycoprotein receptor; MBP-C, mannose-binding protein-C; PSP, pulmonary surfactant-associated protein; NKG,
natural killer group; MMR, macrophage mannose receptor; PAP, pancreatitis-associated protein; MBP, major basic protein.

FIG. 3. Unrooted phylogenetic tree of the C-type lectin superfamily. The scale bar represents a branch length corresponding to 0.1 amino
acid substitution per site. Abbreviations used are as follows: p, pig; b, bovine; gp, guinea pig; c, chicken; CGN, conglutinin; CL, collectin;
MASGP, macrophage asialoglycoprotein-binding protein; HL, hepatic lectin; PSP, pancreatic stone protein; NKR, natural killer cell receptor.
Refer to the legend of Figure 2 for other abbreviations.

Hydrophilicity analysis of the human, murine and rat SCGF
proteins indicated the presence of a signal sequence and the
absence of a putative transmembrane region. There were no
potential N-linked glycosylation sites, and only hSCGF
contained an RGD sequence at amino acid position 61 (Figure 2).
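
The kind of profile that underlies such an analysis can be sketched with a sliding-window average. The example below uses the Kyte-Doolittle hydropathy scale, which is swapped in here for illustration and is not necessarily the scale behind the hydrophilicity analysis or the TopPred II prediction; the window length of 9 and the function names are likewise assumptions. The input is residues 1-56 of mSCGF as listed in Figure 1.

```python
# Kyte-Doolittle hydropathy values (positive = hydrophobic).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def mean_hydropathy(segment: str) -> float:
    """Average Kyte-Doolittle hydropathy of a peptide segment."""
    return sum(KYTE_DOOLITTLE[aa] for aa in segment) / len(segment)


def hydropathy_profile(protein: str, window: int = 9) -> list:
    """Sliding-window mean hydropathy; one value per window start."""
    return [
        mean_hydropathy(protein[i:i + window])
        for i in range(len(protein) - window + 1)
    ]


# Residues 1-56 of mSCGF, transcribed from Figure 1.
MSCGF_1_56 = (
    "MQAAWLLGALVVPQLL"
    "SFGHGARGPGREWEGGWGGA"
    "LEEERERESQMLKNLQEALG"
)

if __name__ == "__main__":
    profile = hydropathy_profile(MSCGF_1_56)
    peak = profile.index(max(profile))
    print(f"most hydrophobic window starts at residue {peak + 1}")
```

With these assumptions the most hydrophobic window falls within the first 16 residues, consistent with an N-terminal signal sequence, whereas the Glu-rich stretch that follows averages well below zero.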

Characterization of SCGF in Silico 

The computer-assisted homology search revealed that the
primary structures of hSCGF-α, mSCGF and rSCGF shared a
motif sequence designated as C-type lectin (Figure 2). The
conserved domain, which seems to function as a
calcium-dependent carbohydrate-recognition domain (CRD)
(16), is described below. This domain consists of about 110
to 130 amino acid residues containing six cysteines as well
as other optional amino acids. Four cysteines are perfectly
conserved and involved in two disulfide bonds, and two more
cysteines are optionally conserved. A striking consensus
pattern is
C-[LIVMFATG]-x(5,12)-[WL]-x-[DNSR]-x(2)-C-x(5,6)-[FYWLIVSTA]-[LIVSTA]-C.
Among the members of the C-type lectin superfamily
identified thus far, SCGFs, except hSCGF-β, showed the
greatest homology with tetranectin (TN), a plasminogen
kringle 4-binding protein, especially with shark TN, which
scored slightly higher than TN of other species. hSCGF
showed 32.2%, 27.2% and 33.7% aa identity, and 48.0%, 46.5%
and 53.6% aa similarity to hTN (17), mTN (18) and shark TN
(19), respectively. In the phylogenetic tree, SCGFs appeared
close to the tetranectins, but formed a cluster separate
from the other members of the C-type lectin superfamily
(Figure 3).
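
The consensus pattern quoted above can be applied directly to a sequence once it is converted to a regular expression. The following sketch does this for the subset of PROSITE syntax that appears in this pattern (single residues, bracketed residue classes, `x`, and repeat counts); it is not a full PROSITE parser, and the function name `prosite_to_regex` is illustrative. The test sequence is residues 297-328 of mSCGF from Figure 1.

```python
import re

CTYPE_LECTIN_PATTERN = (
    "C-[LIVMFATG]-x(5,12)-[WL]-x-[DNSR]-x(2)-C-x(5,6)-"
    "[FYWLIVSTA]-[LIVSTA]-C"
)

# One pattern element: a residue, a residue class, or 'x', optionally
# followed by a repeat count such as (2) or (5,12).
_ELEMENT = re.compile(r"(x|[A-Z]|\[[A-Z]+\])(?:\((\d+)(?:,(\d+))?\))?")


def prosite_to_regex(pattern: str) -> str:
    """Convert a simple PROSITE-style pattern to a Python regex string."""
    parts = []
    for element in pattern.split("-"):
        match = _ELEMENT.fullmatch(element)
        if match is None:
            raise ValueError(f"unsupported pattern element: {element!r}")
        residue, low, high = match.groups()
        token = "." if residue == "x" else residue
        if low is not None:
            token += "{" + low + ("," + high if high else "") + "}"
        parts.append(token)
    return "".join(parts)


# Residues 297-328 of mSCGF, transcribed from Figure 1.
MSCGF_297_328 = "VLENCVAQASDDGSWWDHDCERRLYFVCEFPF"

if __name__ == "__main__":
    regex = re.compile(prosite_to_regex(CTYPE_LECTIN_PATTERN))
    hit = regex.search(MSCGF_297_328)
    print(hit.group(0) if hit else "no match")
```

On this C-terminal segment the pattern matches a 24-residue stretch ending at the last cysteine, in agreement with the assignment of mSCGF to the C-type lectin superfamily.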

Northern Analysis of Human SCGF mRNA 

Figure 4 shows Northern blot analysis of the expression of
hSCGF transcripts in hematopoietic tissues. The predominant
RNA band in the blots had a size of 1.4-1.6 kb. SCGF
transcripts were seen in many human tissues, including
spleen, thymus, appendix, bone marrow and fetal liver,
whereas expression in peripheral blood leukocytes was low.
An additional transcript of 2.2-2.4 kb was detected in thymus
and bone marrow.

FIG. 4. Northern blot analysis of SCGF transcripts. RNAs
isolated from human tissues were probed with labeled
antisense RNA encoding human SCGF. Lane 1, spleen; lane 2,
lymph node; lane 3, thymus; lane 4, appendix; lane 5,
peripheral blood leukocytes; lane 6, bone marrow; lane 7,
fetal liver. The positions of RNA size markers are given in
kb on the left. Arrows indicate SCGF transcripts of 1.4-1.6
kb and 2.2-2.4 kb.



Chromosomal Mapping of SCGF Gene 

Chromosomes from human peripheral blood lymphocytes and
murine spleen lymphocytes were analyzed by FISH to determine
where the SCGF gene resides in the human and murine genomes,
respectively. Under the conditions used, the hybridization
efficiency was approximately 61% for the hSCGF probe. DAPI
banding was used to identify the specific chromosome, and
the signal from the probe was assigned to the long arm of
chromosome 19. No additional locus was detected by FISH;
therefore, the hSCGF gene was located on human chromosome
19, region q13.3 (Figure 5A and B). The mSCGF gene was
mapped to chromosome 7, region B3-B5, with a hybridization
efficiency of 64% for the mSCGF probe (Figure 5C and D).

DISCUSSION 

In this paper, we have described the cloning and
characterization of a full-length SCGF cDNA. Unlike SCF and
flk-2/flt3 ligand, which contain a transmembrane region,
neither hSCGF-α nor hSCGF-β has any hydrophobic region other
than the N-terminal signal sequence, indicating that this
protein exists in a soluble form. The newly isolated hSCGF-α
differs significantly from the previously cloned hSCGF-β in
that it contains an additional 78 aa inserted into the
hSCGF-β sequence. RT-PCR analysis was carried out using mRNA
prepared from human bone marrow and from KPB-M15 cells, from
which SCGF-β cDNA was originally isolated. Under the same
amplification conditions as those used for cloning of SCGF-α
cDNA, a band corresponding to SCGF-β was undetectable (data
not shown). The reason for the rare expression of SCGF-β
mRNA in KPB-M15 and human bone marrow cells is currently
unclear. Further experiments are needed to identify the
genomic DNA and to determine how expression of the two forms
is regulated.

hSCGF-β lacks a conserved CRD, whereas hSCGF-α, mSCGF and
rSCGF have a well-conserved CRD and are supposed to belong
to the C-type lectin superfamily; that is, the inserted
78-aa sequence occupies almost half of the CRD (Figure 2).
According to the phylogenetic tree of the C-type lectin
superfamily, shark TN is closer to SCGFs than human or
murine TN. Since C-type lectins generally bind through the
CRD, as a ligand or counter-receptor, in a calcium-dependent
manner to carbohydrates of their specific receptors (16),
SCGF could interact with carbohydrates. Although CD94/NKG2A,
B and C are other members of the C-type lectin superfamily
(Figure 3), carbohydrates found on HLA-E, their receptor,
are not required for binding but form additional points of
interaction (20). Another exceptional example is the
lecticans; these chondroitin sulfate proteoglycans,
including aggrecan, versican, neurocan and brevican (Figure
3), bind tenascin-R by protein-protein interactions
independent of carbohydrates (21). Since SCGF-β lacks half
of the CRD, its affinity for the receptor might be
decreased. Although the activity of SCGF-α on colony
formation has not yet been measured, the existence of the
CRD may help to enhance the activity. hSCGFs, both α and β
forms, contain an RGD sequence at amino acid position 61,
whereas mSCGF and rSCGF do not, indicating that little, if
any, interaction with adhesion molecules would occur at this
site.

FIG. 5. FISH mapping of SCGF gene. An example of FISH
mapping of hSCGF (A) and mSCGF (C) is shown; the FISH
signals on the chromosome (left panel) and the same mitotic
figure stained with DAPI (right panel) to identify
chromosome 19 (A) and chromosome 7 (C). Diagrams of FISH
mapping results are shown for hSCGF (B) and mSCGF (D). Each
dot represents the double FISH signals detected on a
chromosome.

Northern blot analysis carried out previously using 
tissues and cell lines indicates that SCGF functions 
within the hematopoietic microenvironment (2). The 
present data are compatible with the previous ones; 
SCGF mRNA is expressed at a high level within the 
hematopoietic tissues, particularly bone marrow. Peripheral
blood leukocytes show a low content of SCGF mRNA, which
could be ascribed not to lymphocytes but to contaminating
monocytes in the cell population. In
fact, the expression of SCGF mRNA was determined 
using monocytoid cell lines (2). Therefore, it is likely 
that the receptor for SCGF is expressed on primitive 
hematopoietic progenitor cells, which, when identified, 
should provide valuable insight into the biological func- 
tion of SCGF. 

The SCGF gene is most likely located on chromosome 19q13.3
in the human and chromosome 7B3-B5 in the mouse. Among
hematopoietic growth factors, the flk-2/flt3 ligand (22) and
IL-11 (23) genes are located in the same chromosomal regions
in both human and mouse, while the hTN gene is found on
chromosome 3p22-p21.3 (24). Multiple human genes encoding
hematopoiesis-related growth factors or receptors, e.g.
GM-CSF, M-CSF receptor, IL-3, IL-4, IL-5, IL-9, IL-12B,
IL-13, fibroblast growth factor (FGF) 1, FGF receptor 4,
platelet-derived growth factor receptor and flt4, are
clustered on the long arm of chromosome 5, implying that
they are involved in the pathogenesis of acute lymphoblastic
leukemia with karyotype t(5;14)(q31;q32) (25), malignant
histiocytosis with karyotype t(2;5)(p23;q35) (26) and 5q-
syndrome (27, 28); in the last case, clonal evolution with
karyotype del(5)(q13q35) is associated with acute
nonlymphoblastic leukemia or myelodysplastic syndrome. Two
causative genes on chromosome 19q13 have been reported for
hematological diseases: that for congenital hypoplastic
(Diamond-Blackfan) anemia (DBA) (29) and bcl-3 (30), on
19q13.2. Translocation occurs between the immunoglobulin
heavy chain gene and bcl-3 in certain cases of chronic
lymphocytic leukemia with karyotype t(14;19)(q32;q13). The
SCGF gene could be one of the candidate genes involved in
certain types of hematopoietic disease, and it would be
intriguing to examine karyotype abnormalities in 19q13.3 or
SCGF gene mutations in such diseases.